Poisson binomial distribution (poisson_binom)#

The Poisson binomial distribution is the distribution of the sum of independent Bernoulli random variables with different success probabilities.

If \(X_i \sim \mathrm{Bernoulli}(p_i)\) independently for \(i=1,\dots,n\), then

\[X = \sum_{i=1}^n X_i \sim \mathrm{PoissonBinomial}(p_1,\dots,p_n).\]

Learning goals#

  • Recognize Poisson binomial data and typical modeling patterns.

  • Write the PMF and CDF, and understand the generating-function view.

  • Compute mean/variance/skewness/kurtosis via additivity of cumulants.

  • Implement PMF/CDF and sampling (NumPy-only) and visualize behavior.

  • Use scipy.stats.poisson_binom for probability calculations and hypothesis tests.

Prerequisites#

  • Expectation and variance (linearity and independence)

  • Comfort with logs, products, and basic numerical computing

Notebook roadmap#

  1. Title & Classification

  2. Intuition & Motivation

  3. Formal Definition

  4. Moments & Properties

  5. Parameter Interpretation

  6. Derivations (Expectation, Variance, Likelihood)

  7. Sampling & Simulation (NumPy-only)

  8. Visualization (PMF, CDF, Monte Carlo)

  9. SciPy Integration

  10. Statistical Use Cases

  11. Pitfalls

  12. Summary

import os
import sys

import numpy as np
import plotly
import plotly.graph_objects as go
import plotly.io as pio
from plotly.subplots import make_subplots

import scipy
from scipy import stats
from scipy.optimize import minimize
from scipy.special import expit, xlogy

pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

np.set_printoptions(precision=4, suppress=True)

SEED = 7
rng = np.random.default_rng(SEED)

print("Python:", sys.version.split()[0])
print("NumPy:", np.__version__)
print("SciPy:", scipy.__version__)
print("Plotly:", plotly.__version__)
print("Seed:", SEED)
Python: 3.12.9
NumPy: 1.26.2
SciPy: 1.15.0
Plotly: 6.5.2
Seed: 7

1) Title & Classification#

  • Name: poisson_binom (Poisson binomial distribution)

  • Type: Discrete

  • Support: \(k \in \{0,1,\dots,n\}\) where \(n = \lvert\mathbf p\rvert\) is the number of Bernoulli trials

  • Parameter space:

    • \(\mathbf p = (p_1,\dots,p_n)\) with \(0 \le p_i \le 1\) for all \(i\).

    • (SciPy) an integer shift loc is also available; it shifts the support to \(\{\mathrm{loc}, \dots, \mathrm{loc}+n\}\).

Notation:

  • \(X \sim \mathrm{PoissonBinomial}(p_1,\dots,p_n)\).

  • Sometimes written as \(X \sim \mathrm{PB}(\mathbf p)\).

2) Intuition & Motivation#

What this distribution models#

You observe a count of successes across \(n\) Bernoulli trials, but the trials are heterogeneous:

  • Trial \(i\) succeeds with probability \(p_i\).

  • Trials are independent, but the \(p_i\) are not necessarily equal.

So \(X\) is a heterogeneous version of a binomial count.

Typical real-world use cases#

  • Reliability engineering: number of component failures when components have different failure probabilities.

  • Marketing / product analytics: number of conversions when each user/session has a different conversion probability.

  • Elections / forecasting: number of wins across districts with different win probabilities.

  • Multiple testing: number of discoveries when each test has a different Type-I error probability (or power).

Relations to other distributions#

  • Binomial: if all \(p_i\) are equal to a common \(p\), then \(X \sim \mathrm{Binomial}(n,p)\).

  • Poisson approximation: if all \(p_i\) are small and \(\lambda = \sum_i p_i\) is moderate, then \(X \approx \mathrm{Poisson}(\lambda)\).

  • Normal approximation (CLT): for large \(n\) with non-degenerate variance, \(X\) is approximately normal with mean \(\sum_i p_i\) and variance \(\sum_i p_i(1-p_i)\).

  • Sum of independent Bernoulli variables: by definition, the Poisson binomial is exactly the distribution of such a sum.
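These relationships are easy to check numerically. A small sketch (the values of \(n\) and \(p\), and the throwaway \(\mathcal O(n^2)\) PMF helper, are illustrative) comparing the Poisson binomial against its binomial and Poisson limits:

```python
import numpy as np
from scipy import stats

def pb_pmf(p):
    """Poisson binomial PMF via repeated polynomial convolution (O(n^2))."""
    pmf = np.array([1.0])
    for pi in p:
        pmf = np.convolve(pmf, [1.0 - pi, pi])
    return pmf

n, prob = 25, 0.04
k = np.arange(n + 1)
p_vec = np.full(n, prob)

# Equal p_i: the Poisson binomial reduces exactly to Binomial(n, p)
diff_binom = np.max(np.abs(pb_pmf(p_vec) - stats.binom.pmf(k, n, prob)))

# Small p_i: Poisson(lambda = sum p_i) approximates, but does not match exactly
diff_pois = np.max(np.abs(pb_pmf(p_vec) - stats.poisson.pmf(k, p_vec.sum())))

print(f"max |PB - Binomial| = {diff_binom:.1e}")  # ~ machine precision
print(f"max |PB - Poisson|  = {diff_pois:.1e}")   # small but visible
```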

3) Formal Definition#

Let \(X_1,\dots,X_n\) be independent with

\[\mathbb P(X_i = 1) = p_i, \qquad \mathbb P(X_i = 0) = 1-p_i.\]

Define

\[X = \sum_{i=1}^n X_i.\]

PMF#

For \(k \in \{0,1,\dots,n\}\),

\[\mathbb P(X = k) = \sum_{A \subseteq \{1,\dots,n\}:\, |A|=k} \ \prod_{i\in A} p_i \prod_{j\notin A} (1-p_j).\]

A very useful equivalent view is via the probability generating function (PGF):

\[G_X(z) = \mathbb E[z^X] = \prod_{i=1}^n \big((1-p_i) + p_i z\big).\]

Then \(\mathbb P(X=k)\) is exactly the coefficient of \(z^k\) in \(G_X(z)\):

\[\mathbb P(X=k) = [z^k] \prod_{i=1}^n \big((1-p_i) + p_i z\big).\]
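Coefficient extraction can be done literally in NumPy: build the product of linear factors with np.convolve on ascending-power coefficient arrays, and the resulting coefficients are the PMF (the probability vector here is illustrative).

```python
import numpy as np

p = np.array([0.10, 0.25, 0.60, 0.80])

# Coefficients of G_X(z) in ascending powers of z: start from the constant
# polynomial 1 and multiply in one factor (1-p_i) + p_i z at a time
coeffs = np.array([1.0])
for pi in p:
    coeffs = np.convolve(coeffs, [1.0 - pi, pi])

# coeffs[k] is exactly P(X = k), and G_X(1) = 1 forces the sum to be 1
print(coeffs)
print(coeffs.sum())
```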

CDF#

The CDF is the partial sum of the PMF:

\[F(k) = \mathbb P(X \le k) = \sum_{j=0}^{\lfloor k\rfloor} \mathbb P(X=j).\]

There is no single closed form for general \((p_1,\dots,p_n)\), but it is efficiently computable.
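For small \(n\), the subset-sum definition can be evaluated verbatim as a brute-force sanity check (the length-4 probability vector is illustrative):

```python
import itertools
import numpy as np

p = np.array([0.10, 0.25, 0.60, 0.80])
n = p.size

# P(X = k) = sum over subsets A with |A| = k of
#   prod_{i in A} p_i * prod_{j not in A} (1 - p_j)
pmf_brute = np.zeros(n + 1)
for k in range(n + 1):
    for A in itertools.combinations(range(n), k):
        mask = np.zeros(n, dtype=bool)
        mask[list(A)] = True
        pmf_brute[k] += p[mask].prod() * (1.0 - p[~mask]).prod()

print(pmf_brute, pmf_brute.sum())  # O(2^n) work: fine for n=4, hopeless for large n
```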

def validate_p_vector(p) -> np.ndarray:
    p = np.asarray(p, dtype=float)
    if p.ndim != 1:
        raise ValueError("p must be a 1D array-like of probabilities")
    if p.size == 0:
        raise ValueError("p must have at least one element")
    if not np.all(np.isfinite(p)):
        raise ValueError("p must be finite")
    if np.any((p < 0.0) | (p > 1.0)):
        raise ValueError("each p_i must be in [0, 1]")
    return p


def poisson_binom_support(p: np.ndarray) -> np.ndarray:
    p = validate_p_vector(p)
    return np.arange(p.size + 1)
def poisson_binom_pmf_array(p: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return (k, pmf[k]) for k=0..n using a stable O(n^2) recursion.

    Recurrence: after processing probabilities p_1..p_i,
        P_i(k) = P_{i-1}(k) (1-p_i) + P_{i-1}(k-1) p_i.
    """
    p = validate_p_vector(p)
    n = p.size

    pmf = np.zeros(n + 1, dtype=float)
    pmf[0] = 1.0

    for pi in p:
        pmf_next = np.zeros_like(pmf)
        pmf_next[0] = pmf[0] * (1.0 - pi)
        pmf_next[1:] = pmf[1:] * (1.0 - pi) + pmf[:-1] * pi
        pmf = pmf_next

    pmf = np.clip(pmf, 0.0, 1.0)
    pmf = pmf / pmf.sum()  # guard against tiny rounding drift

    k = np.arange(n + 1)
    return k, pmf


def poisson_binom_logpmf(k, p: np.ndarray) -> np.ndarray:
    p = validate_p_vector(p)
    n = p.size

    k_arr = np.asarray(k)
    out = np.full(k_arr.shape, -np.inf, dtype=float)

    k_int = k_arr.astype(int)
    valid = (k_int == k_arr) & (k_int >= 0) & (k_int <= n)
    if not np.any(valid):
        return out

    _, pmf = poisson_binom_pmf_array(p)
    with np.errstate(divide="ignore"):  # log(0) -> -inf for zero-probability k
        out[valid] = np.log(pmf[k_int[valid]])
    return out


def poisson_binom_pmf(k, p: np.ndarray) -> np.ndarray:
    return np.exp(poisson_binom_logpmf(k, p))


def poisson_binom_cdf(x, p: np.ndarray) -> np.ndarray:
    p = validate_p_vector(p)
    n = p.size

    x_arr = np.asarray(x)
    out = np.zeros_like(x_arr, dtype=float)

    out[x_arr >= n] = 1.0
    inside = (x_arr >= 0) & (x_arr < n)
    if np.any(inside):
        k = np.floor(x_arr[inside]).astype(int)
        _, pmf = poisson_binom_pmf_array(p)
        cdf = np.cumsum(pmf)
        out[inside] = cdf[k]

    return out

4) Moments & Properties#

Because \(X\) is a sum of independent Bernoulli variables, many properties follow from additivity.

Mean and variance#

Using linearity of expectation and independence:

\[\mathbb E[X] = \sum_{i=1}^n p_i,\qquad \mathrm{Var}(X) = \sum_{i=1}^n p_i(1-p_i).\]

Higher standardized moments#

A convenient way to get skewness/kurtosis is via cumulants. For a Bernoulli\((p)\) variable, the first four cumulants are:

\[\begin{align} \kappa_1 &= p \\ \kappa_2 &= p(1-p) \\ \kappa_3 &= p(1-p)(1-2p) \\ \kappa_4 &= p(1-p)(1 - 6p + 6p^2). \end{align}\]

For independent sums, cumulants add: \(\kappa_r(X) = \sum_i \kappa_r(X_i)\).

Then

\[\begin{align} \text{skewness} \;\gamma_1 &= \frac{\kappa_3}{\kappa_2^{3/2}}, \\ \text{excess kurtosis} \;\gamma_2 &= \frac{\kappa_4}{\kappa_2^2}. \end{align}\]

MGF and characteristic function#

The moment generating function (MGF) and characteristic function factorize:

\[\begin{align} M_X(t) &= \mathbb E[e^{tX}] = \prod_{i=1}^n \big((1-p_i) + p_i e^t\big), \\ \varphi_X(\omega) &= \mathbb E[e^{i\omega X}] = \prod_{i=1}^n \big((1-p_i) + p_i e^{i\omega}\big). \end{align}\]

Entropy#

There is no simple closed form in general. Given the PMF, entropy (in nats) is

\[H(X) = -\sum_{k=0}^n \mathbb P(X=k) \log \mathbb P(X=k).\]
def poisson_binom_cumulants(p: np.ndarray) -> dict:
    p = validate_p_vector(p)
    k1 = float(np.sum(p))
    k2 = float(np.sum(p * (1.0 - p)))
    k3 = float(np.sum(p * (1.0 - p) * (1.0 - 2.0 * p)))
    k4 = float(np.sum(p * (1.0 - p) * (1.0 - 6.0 * p + 6.0 * p**2)))
    return {"k1": k1, "k2": k2, "k3": k3, "k4": k4}


def poisson_binom_moments(p: np.ndarray) -> dict:
    ks = poisson_binom_cumulants(p)
    mean = ks["k1"]
    var = ks["k2"]

    if var == 0.0:
        skew = float("nan")
        excess_kurt = float("nan")
    else:
        skew = ks["k3"] / (var ** 1.5)
        excess_kurt = ks["k4"] / (var ** 2)

    return {
        "mean": mean,
        "var": var,
        "skew": skew,
        "excess_kurt": excess_kurt,
    }


def poisson_binom_log_mgf(t, p: np.ndarray) -> np.ndarray:
    """log M_X(t) = sum_i log((1-p_i) + p_i e^t)."""
    p = validate_p_vector(p)
    t = np.asarray(t, dtype=float)

    # logaddexp(log(1-p), log(p) + t) is overflow-safe for large t and well
    # defined at p=0 or p=1 (the log(0) = -inf term drops out of logaddexp)
    with np.errstate(divide="ignore"):
        log_terms = np.logaddexp(np.log1p(-p), np.log(p) + t[..., None])
    return np.sum(log_terms, axis=-1)


def poisson_binom_mgf(t, p: np.ndarray) -> np.ndarray:
    return np.exp(poisson_binom_log_mgf(t, p))


def poisson_binom_cf(omega, p: np.ndarray) -> np.ndarray:
    p = validate_p_vector(p)
    omega = np.asarray(omega)
    return np.prod(1.0 - p + p * np.exp(1j * omega[..., None]), axis=-1)


def poisson_binom_entropy(p: np.ndarray, *, base=np.e) -> float:
    p = validate_p_vector(p)
    _, pmf = poisson_binom_pmf_array(p)

    H_nats = -float(np.sum(xlogy(pmf, pmf)))
    if base == np.e:
        return H_nats
    return H_nats / float(np.log(base))
p = np.array([0.10, 0.25, 0.60, 0.80])

k, pmf = poisson_binom_pmf_array(p)
cdf = np.cumsum(pmf)

mom = poisson_binom_moments(p)
print("p:", p)
print("support:", k)
print("pmf sum:", pmf.sum())
print("mean/var/skew/excess_kurt (from cumulants):", mom)
print("entropy (nats):", poisson_binom_entropy(p))

# Monte Carlo sanity check
samples = (rng.random((200_000, p.size)) < p).sum(axis=1)
print("MC mean:", samples.mean(), " | formula:", mom["mean"])
print("MC var :", samples.var(ddof=0), " | formula:", mom["var"])
p: [0.1  0.25 0.6  0.8 ]
support: [0 1 2 3 4]
pmf sum: 1.0
mean/var/skew/excess_kurt (from cumulants): {'mean': 1.75, 'var': 0.6775, 'skew': 0.03900275742662621, 'excess_kurt': -0.17698560749445102}
entropy (nats): 1.2220616806986953
MC mean: 1.752005  | formula: 1.75
MC var : 0.6768534799749998  | formula: 0.6775

5) Parameter Interpretation#

The parameter vector \(\mathbf p=(p_1,\dots,p_n)\) is the list of per-trial success probabilities.

  • Each \(p_i\) controls how likely trial \(i\) is to contribute a 1 to the sum.

  • The mean depends only on the sum \(\sum_i p_i\).

  • The variance depends on \(\sum_i p_i(1-p_i)\).

    • For a fixed mean, variance is maximized when the \(p_i\) are all equal (a consequence of concavity of \(p(1-p)\)).

    • Making probabilities more extreme (closer to 0 or 1) typically reduces variance and makes the distribution more concentrated.

Because the parameters are a vector, many different shapes are possible even with the same mean.

def show_pmf_comparison(ps: dict[str, np.ndarray], *, title: str):
    fig = go.Figure()
    for name, pvec in ps.items():
        k, pmf = poisson_binom_pmf_array(pvec)
        fig.add_trace(go.Scatter(x=k, y=pmf, mode="lines+markers", name=name))

    fig.update_layout(
        title=title,
        xaxis_title="k (number of successes)",
        yaxis_title="P(X = k)",
        legend_title="parameter set",
    )
    fig.show()


n = 30
mean_target = 0.30

p_equal = np.full(n, mean_target)

# Same mean, more heterogeneous probabilities
p_mixture = np.r_[np.full(n // 2, 0.10), np.full(n - n // 2, 0.50)]  # mean = 0.30
p_extreme = np.r_[np.zeros(n // 2), np.full(n - n // 2, 0.60)]        # mean = 0.30

ps = {
    "all p_i = 0.30": p_equal,
    "half 0.10, half 0.50": p_mixture,
    "half 0.00, half 0.60": p_extreme,
}

for name, pvec in ps.items():
    m = poisson_binom_moments(pvec)
    print(f"{name:>22} | mean={m['mean']:.2f}, var={m['var']:.2f}")

show_pmf_comparison(ps, title="Same mean, different heterogeneity → different variance/shape")
        all p_i = 0.30 | mean=9.00, var=6.30
  half 0.10, half 0.50 | mean=9.00, var=5.10
  half 0.00, half 0.60 | mean=9.00, var=3.60

6) Derivations#

A) Expectation#

Because \(X = \sum_i X_i\),

\[\mathbb E[X] = \sum_{i=1}^n \mathbb E[X_i] = \sum_{i=1}^n p_i.\]

B) Variance#

For independent \(X_i\),

\[\begin{align} \mathrm{Var}(X) &= \mathrm{Var}\left(\sum_i X_i\right) = \sum_i \mathrm{Var}(X_i) = \sum_i p_i(1-p_i). \end{align}\]

C) Likelihood#

Suppose you observe i.i.d. samples \(x_1,\dots,x_m\) of the sum \(X\) (each sample is a sum of \(n\) Bernoulli trials with the same probability vector \(\mathbf p\)).

The likelihood is

\[L(\mathbf p; x_{1:m}) = \prod_{j=1}^m \mathbb P_{\mathbf p}(X=x_j).\]

Equivalently, the log-likelihood is

\[\ell(\mathbf p; x_{1:m}) = \sum_{j=1}^m \log \mathbb P_{\mathbf p}(X=x_j).\]

Two important practical notes:

  • The likelihood is invariant to permuting the entries of \(\mathbf p\) (only the multiset of probabilities matters).

  • Estimating a full length-\(n\) vector \(\mathbf p\) from only the sums is often ill-posed unless \(n\) is small or you add structure/priors.

def poisson_binom_log_likelihood(p: np.ndarray, data: np.ndarray) -> float:
    p = validate_p_vector(p)
    data = np.asarray(data)

    n = p.size
    if np.any((data < 0) | (data > n) | (data != data.astype(int))):
        return -np.inf

    _, pmf = poisson_binom_pmf_array(p)
    logpmf = np.log(pmf)
    return float(np.sum(logpmf[data.astype(int)]))


# A single-observation likelihood surface (n=2)
# This is small enough that we can visualize log L(p1,p2) on a grid.

x_obs = 1
p1_grid = np.linspace(0.01, 0.99, 80)
p2_grid = np.linspace(0.01, 0.99, 80)

LL = np.empty((p1_grid.size, p2_grid.size))
for i, p1 in enumerate(p1_grid):
    for j, p2 in enumerate(p2_grid):
        LL[i, j] = poisson_binom_log_likelihood(np.array([p1, p2]), np.array([x_obs]))

fig = go.Figure(
    data=go.Heatmap(
        x=p2_grid,
        y=p1_grid,
        z=LL,
        colorscale="Viridis",
        colorbar_title="log L",
    )
)
fig.update_layout(
    title=f"Log-likelihood surface for a single observation x={x_obs} (n=2)",
    xaxis_title="p2",
    yaxis_title="p1",
)
fig.show()

7) Sampling & Simulation (NumPy-only)#

The most direct sampler matches the definition:

  1. For each trial \(i\), draw \(X_i \sim \mathrm{Bernoulli}(p_i)\).

  2. Return \(X = \sum_i X_i\).

This is \(\mathcal O(n)\) work per sample and is typically fast in NumPy due to vectorization.

Below is a NumPy-only implementation using uniforms: \(X_i = \mathbb 1\{U_i < p_i\}\) with \(U_i \sim \mathrm{Uniform}(0,1)\).

def sample_poisson_binom_numpy(p: np.ndarray, size=1, *, rng: np.random.Generator) -> np.ndarray:
    p = validate_p_vector(p)
    n = p.size

    if isinstance(size, (int, np.integer)):
        size_tuple = (int(size),)
    else:
        size_tuple = tuple(size)

    u = rng.random((*size_tuple, n))
    return (u < p).sum(axis=-1)


p = np.array([0.10, 0.25, 0.60, 0.80])

s = sample_poisson_binom_numpy(p, size=10, rng=rng)
print("samples:", s)
samples: [3 2 2 2 1 1 3 2 1 2]

8) Visualization#

We’ll visualize:

  • the exact PMF (via dynamic programming),

  • the exact CDF,

  • and Monte Carlo samples from the NumPy-only sampler.

p = np.array([0.05, 0.10, 0.20, 0.35, 0.55, 0.70, 0.85])

k, pmf = poisson_binom_pmf_array(p)
cdf = np.cumsum(pmf)

mc = sample_poisson_binom_numpy(p, size=100_000, rng=rng)
emp = np.bincount(mc, minlength=p.size + 1) / mc.size

fig = make_subplots(rows=1, cols=2, subplot_titles=("PMF", "CDF"))

fig.add_trace(go.Bar(x=k, y=pmf, name="Exact PMF", opacity=0.65), row=1, col=1)
fig.add_trace(go.Scatter(x=k, y=emp, mode="markers", name="MC empirical PMF"), row=1, col=1)

fig.add_trace(go.Scatter(x=k, y=cdf, mode="lines+markers", name="Exact CDF"), row=1, col=2)

fig.update_layout(
    title=f"Poisson binomial with n={p.size} (heterogeneous p_i)",
    bargap=0.2,
)

fig.update_xaxes(title_text="k", row=1, col=1)
fig.update_yaxes(title_text="P(X=k)", row=1, col=1)

fig.update_xaxes(title_text="k", row=1, col=2)
fig.update_yaxes(title_text="P(X≤k)", row=1, col=2, range=[0, 1.02])

fig.show()

9) SciPy Integration#

SciPy (v1.15+) includes scipy.stats.poisson_binom.

Key methods:

  • pmf, logpmf

  • cdf, sf (survival function; often numerically better for upper tails)

  • rvs (sampling)

  • stats(..., moments='mvsk') (mean/var/skew/excess kurtosis)

  • entropy

We’ll compare SciPy’s PMF to our NumPy recursion and show typical usage.

About fitting: SciPy’s generic scipy.stats.fit does not currently support poisson_binom because the shape parameter p is vector-valued, and generic MLE fitting utilities are not designed for that case. In practice, you usually:

  • treat \(\mathbf p\) as known/estimated from other data, and then use the distribution for inference on the sum, or

  • impose structure on \(\mathbf p\) (e.g., a low-dimensional model) and fit that model.

p = np.array([0.05, 0.10, 0.20, 0.35, 0.55, 0.70, 0.85])

rv = stats.poisson_binom(p)

k = np.arange(p.size + 1)

pmf_scipy = rv.pmf(k)
_, pmf_numpy = poisson_binom_pmf_array(p)

print("max |pmf_scipy - pmf_numpy| =", float(np.max(np.abs(pmf_scipy - pmf_numpy))))


cdf_scipy = rv.cdf(k)
cdf_numpy = poisson_binom_cdf(k, p)
print("max |cdf_scipy - cdf_numpy| =", float(np.max(np.abs(cdf_scipy - cdf_numpy))))

mean, var, skew, excess_kurt = rv.stats(moments="mvsk")
print("SciPy mean/var/skew/excess_kurt:", float(mean), float(var), float(skew), float(excess_kurt))
print("NumPy mean/var/skew/excess_kurt:", poisson_binom_moments(p))

print("SciPy entropy (nats):", float(rv.entropy()))

# Sampling
r = rv.rvs(size=10_000, random_state=rng)
print("sample mean (SciPy rvs):", r.mean())
max |pmf_scipy - pmf_numpy| = 5.551115123125783e-17
max |cdf_scipy - cdf_numpy| = 1.1102230246251565e-16
SciPy mean/var/skew/excess_kurt: 2.8 1.1099999999999999 0.06926288077113398 -0.1184562941320082
NumPy mean/var/skew/excess_kurt: {'mean': 2.8, 'var': 1.1099999999999999, 'skew': 0.06926288077112831, 'excess_kurt': -0.1184562941319698}
SciPy entropy (nats): 1.4700953431353814
sample mean (SciPy rvs): 2.808
# Attempting to use scipy.stats.fit (as of SciPy 1.15)
#
# poisson_binom has a vector-valued shape parameter p, and SciPy's generic fitter
# isn't currently set up to optimize vector-valued shape parameters.

import warnings

p_true = np.array([0.2, 0.5, 0.7])
data = stats.poisson_binom(p_true).rvs(size=500, random_state=rng)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    try:
        fit_res = stats.fit(stats.poisson_binom, data)
        print(fit_res)
    except Exception as e:
        print("scipy.stats.fit is not currently supported for poisson_binom (vector-valued p).")
        print("Error:", type(e).__name__, "-", e)
scipy.stats.fit is not currently supported for poisson_binom (vector-valued p).
Error: RuntimeError - The map-like callable must be of the form f(func, iterable), returning a sequence of numbers the same length as 'iterable'
# Optional: demonstrate a simple MLE when n is small (educational)
#
# Warning: estimating a full vector p from only sum data is often ill-posed:
# the likelihood depends on p only through the PMF of the sum, so the optimizer
# can settle on a symmetric all-equal vector instead of the true one.

p_true = np.array([0.15, 0.35, 0.60, 0.85])
rv_true = stats.poisson_binom(p_true)

data = rv_true.rvs(size=2_000, random_state=rng)


def neg_log_likelihood_unconstrained(q: np.ndarray) -> float:
    # unconstrained q in R^n -> p in (0,1) via logistic transform
    p = expit(q)
    ll = poisson_binom_log_likelihood(p, data)
    return -ll if np.isfinite(ll) else 1e30


# initialize near a binomial approximation: all p_i equal to sample_mean / n
p0 = np.full_like(p_true, data.mean() / p_true.size)
q0 = np.log(p0) - np.log1p(-p0)

res = minimize(neg_log_likelihood_unconstrained, q0, method="BFGS")
p_hat = expit(res.x)

print("converged:", res.success)
print("true p (sorted):", np.sort(p_true))
print("mle  p (sorted):", np.sort(p_hat))
print("note: order is not identifiable; only the multiset matters")
converged: True
true p (sorted): [0.15 0.35 0.6  0.85]
mle  p (sorted): [0.4896 0.4896 0.4896 0.4896]
note: order is not identifiable; only the multiset matters

10) Statistical Use Cases#

A) Hypothesis testing (tail probabilities)#

If you know/assume the probabilities \(\mathbf p\), you can test whether an observed count \(x\) is unusually large/small.

Example: under the null model \(X \sim \mathrm{PB}(\mathbf p)\),

  • upper-tail p-value: \(\mathbb P(X \ge x) = \mathrm{sf}(x-1)\)

  • lower-tail p-value: \(\mathbb P(X \le x) = \mathrm{cdf}(x)\)

B) Bayesian modeling (uncertain probabilities)#

If each \(p_i\) is uncertain (e.g., you have a posterior over \(p_i\)), the posterior predictive distribution of the sum is a mixture of Poisson binomials.

A simple approach is Monte Carlo:

  1. sample \(\mathbf p^{(s)}\) from the posterior,

  2. compute the Poisson binomial PMF for each draw,

  3. average the PMFs.

C) Generative modeling#

Poisson binomial is a natural building block for generative models that produce counts from heterogeneous binary events (e.g., conversions, failures, wins).
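As one sketch of this generative pattern, assume (hypothetically) that per-user conversion probabilities come from a logistic model; the feature, coefficients, and sample size below are all made up:

```python
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(0)

# Hypothetical logistic model: one feature per user drives conversion probability
n_users = 1_000
x = rng.normal(size=n_users)
p_i = expit(-2.0 + 1.5 * x)  # heterogeneous p_i in (0, 1)

# One simulated round: each user converts independently, so the total number
# of conversions is a single Poisson binomial draw
converted = rng.random(n_users) < p_i
print("conversions:", int(converted.sum()), "of", n_users)
print("expected   :", round(float(p_i.sum()), 1))
```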

# A) Hypothesis test example: is an observed count unusually high?

p = np.array([0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.25, 0.30])
rv = stats.poisson_binom(p)

x_obs = 5
p_value_upper = rv.sf(x_obs - 1)

print("Observed x =", x_obs)
print("Upper-tail p-value P(X >= x) =", float(p_value_upper))
Observed x = 5
Upper-tail p-value P(X >= x) = 0.00132964999999996
# B) Bayesian predictive example (Monte Carlo mixture)
#
# Suppose each p_i has a Beta prior and we observed (s_i successes out of m_i trials)
# in some historical data. We want the predictive distribution for the *next* round
# of n heterogeneous Bernoulli events.

from scipy.stats import beta

rng2 = np.random.default_rng(SEED)

n = 12

a0, b0 = 1.0, 1.0  # uniform Beta prior
m_i = rng2.integers(20, 80, size=n)

# synthetic historical success counts
p_latent = rng2.uniform(0.05, 0.8, size=n)
s_i = rng2.binomial(m_i, p_latent)

# posterior is Beta(a0+s_i, b0+m_i-s_i)
a_post = a0 + s_i
b_post = b0 + m_i - s_i

S = 5_000  # posterior draws
pmf_accum = np.zeros(n + 1)

for _ in range(S):
    p_draw = beta.rvs(a_post, b_post, random_state=rng2)
    _, pmf_draw = poisson_binom_pmf_array(p_draw)
    pmf_accum += pmf_draw

pmf_pred = pmf_accum / S
k = np.arange(n + 1)

fig = go.Figure(go.Bar(x=k, y=pmf_pred))
fig.update_layout(
    title="Posterior predictive distribution of the sum (mixture of Poisson binomials)",
    xaxis_title="k (successes)",
    yaxis_title="Predictive probability",
)
fig.show()

11) Pitfalls#

  • Invalid parameters: each \(p_i\) must lie in \([0,1]\). Values very close to 0/1 can make the distribution nearly degenerate.

  • Dependence: Poisson binomial assumes independent trials. Positive correlation typically inflates variance relative to the model.

  • Computing the PMF:

    • Naively summing over subsets is \(\mathcal O(2^n)\) and infeasible.

    • The dynamic-programming recursion is \(\mathcal O(n^2)\) and is fine for moderate \(n\).

    • For large \(n\), FFT-based methods can be much faster (SciPy uses fast algorithms internally).

  • Tail probabilities: prefer sf over 1-cdf for upper tails to reduce catastrophic cancellation.

  • Fitting: estimating a full vector \(\mathbf p\) from only sum observations is typically ill-posed; consider adding structure (GLM, hierarchical priors) or using additional data.

  • Serialization: SciPy’s poisson_binom instances currently do not support pickling/unpickling.
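The FFT remark above can be sketched via the characteristic function: because the support is \(\{0,\dots,n\}\), evaluating \(\varphi_X\) at \(N=n+1\) equally spaced frequencies and applying a DFT recovers the PMF exactly (the classic DFT-CF approach; the probability vector below is illustrative).

```python
import numpy as np

p = np.array([0.10, 0.25, 0.60, 0.80])
N = p.size + 1  # support is {0, ..., n}, so N = n + 1 frequencies suffice

# Characteristic function phi(w) = prod_i ((1-p_i) + p_i e^{iw})
omega = 2.0 * np.pi * np.arange(N) / N
phi = np.prod(1.0 - p + p * np.exp(1j * omega[:, None]), axis=1)

# phi_j = sum_k pmf[k] e^{+2*pi*i*j*k/N} is an unnormalized inverse DFT of the
# PMF, so the forward DFT (np.fft.fft uses e^{-2*pi*i*j*k/N}) inverts it
pmf = np.fft.fft(phi).real / N

print(pmf)  # matches the dynamic-programming PMF to floating-point accuracy
```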

12) Summary#

  • poisson_binom models the number of successes in independent, non-identical Bernoulli trials.

  • The PMF is the coefficient of a polynomial / PGF; dynamic programming computes it in \(\mathcal O(n^2)\) time.

  • Mean and variance are simple sums: \(\sum p_i\) and \(\sum p_i(1-p_i)\); higher cumulants add as well.

  • NumPy-only simulation is straightforward: sample each Bernoulli and sum.

  • SciPy’s scipy.stats.poisson_binom provides PMF/CDF/SF/RVS/stats/entropy and is the go-to tool for inference on the sum.